INTRODUCTION


This report is an evaluation of the feasibility of predicting the
reliability of electronic systems through the analysis of failure rates
at accelerated environmental conditions. It contains four major sections
describing (1) some basic fundamentals of reliability theory, (2) methods
of predicting reliability, (3) various types of accelerated testing, and
(4) a typical example of predicting the mean time between failures of
an airborne electronic system. It is the specific intent of the author
to validate procedures for determining the mean time between failures of
an electronic system under accelerated thermal stresses, in an effort to
predict the mean time between failures of the same system under use
conditions. Background information is supplied to acquaint the reader with
definitions, distributions and data sources presently being used by
various manufacturers and government agencies.

The absolute necessity for reliability in complex military systems
has become the most important reason for the multitude of studies about
reliability in recent years. The space program with its demand for
acceleration has created a need for reliability prediction with a high
degree of accuracy. Unreliability and its accompanying cost can best be
described by Lt. General Howell M. Estes (5), the Vice-Commander of the
Air Force Systems Command.

     Maintenance of military electronics equipment now ranges
     between 60 and 1,000 times the initial costs. The progress
     we have made today in system reliability has simply not been
     adequate. To cite some examples: a failure of a 2 dollar item
     in the launch of a space system caused the loss of a 2.2
     million dollar vehicle. The failure of a 5 dollar thermal
     shield resulted in a 23 million dollar disaster. A failure
     of a 25 dollar fuel valve in a ballistic missile brought about
     a loss of 22 million dollars.

The costs related here pertained to money losses, but in the space
race losses must include human lives. The need for improved reliability
prediction becomes obvious when constructing a vehicle for a voyage
into space lasting several years.

FUNDAMENTALS  OF  RELIABILITY 

Reliability has been defined in many ways, but the most widely used
is this: "Inherent reliability is the probability that the equipment
will perform its intended function satisfactorily for a specified period
of time when used in the manner and for the purpose intended." (10)
Satisfactory performance of the system is considered to be operation
within specified functional characteristic limits. Also, satisfactory
performance is considered synonymous with success or non-failure and
unsatisfactory performance constitutes failure.

The relationship between part and system failure is an essential
part of reliability prediction and will be looked at briefly here. One
of the basic problems in predicting reliability of a system is determining
the expected reliabilities of the individual parts as they are applied
in the system. Having a detailed description of how a proposed system
will be employed in a typical mission, the logical structure of system
operations can be developed. This structure should link together the
probabilities of successful operation of all parts into an expression
giving such probability measure for the system as a whole. In this
process, which is called "constructing a mathematical model of the system",
the effects of interactions of the various parts on each other may be
estimated from the design. The overall system can be divided into
sub-systems whose reliability functions are to be statistically
independent of each other. These sub-systems can again be divided into
assemblies of parts whose individual reliabilities have been determined.
From this it is evident that in any study of system reliability it is
essential to determine failure characteristics of individual parts.

Failure  Characteristics 

To examine the manner in which part failures occur, two categories
of failures will be discussed. The first of these are performance
degradation failures, and the second are random catastrophic failures. An
example of the first type would be an electron tube whose transconductance
has diminished to the point of failure from a build-up of interface
resistance. The second type of failure is exemplified by tubes
which have become inoperative because their heaters have opened.

There are two alternatives to the use of analytical techniques in
predicting degradation failures. (21) The most popular is to ignore them
and assume that they are a negligible portion of total failures. This
approach holds credence because many degradation failures can be
eliminated as a result of modern conservative design and such practices as:

1. A design review of each circuit to be employed in a new equipment
and subsequent improvement of the circuit.

2. Type testing on a professional level aimed at performance
improvement as well as equipment certification.

3. Maintenance practices designed to eliminate those parts approaching
wear-out before they fail.

The other approach is to assume that in new systems degradation
failures will represent the same proportion of total failures as they did
in previous systems. Many data sources include degradation failure rates,
but if this information is not included adjustments must be considered.

In the area of catastrophic failure analysis, there are two related
computational techniques involved which employ the exponential failure
law. One method (referred to as "part count") employs parameters which
provide part failure rate by component category. This method is based on
a complete part count to which a single overall average failure rate is
applied. The second computational method has been used effectively by
several groups and includes a greater degree of design detail. This method
entails the following steps: (21)

1. The identification of each individual part in terms of its
family-type, characteristics, controlling specifications, etc.

2. A decision as to applicability of available statistical guides
followed by a choice of satisfactory charts and curves or modification
factors.

3. A determination of the equivalent sustained (electrical and
ambient) stresses applied to each part.

4. Entry into each appropriate figure for determination of the
resultant failure rate for each part.

5. Addition of all hazards as effective failure-rate terms to
derive a "grand total" failure-rate term for the system.
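The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the curve `RESISTOR_CURVE`, its numbers, and the function names are hypothetical and are not taken from any handbook or from reference (21). A real analysis would read the rate from the published figure for each part family.

```python
from bisect import bisect_left

# Hypothetical failure-rate curve for one part family:
# (stress ratio, failures per hour). Illustrative values only.
RESISTOR_CURVE = [(0.2, 1.0e-7), (0.5, 2.0e-7), (1.0, 6.0e-7)]


def rate_from_curve(curve, stress):
    """Step 4: read a failure rate off a (stress, rate) curve.

    Uses linear interpolation between tabulated points and clamps
    to the end points outside the tabulated range.
    """
    xs = [x for x, _ in curve]
    if stress <= xs[0]:
        return curve[0][1]
    if stress >= xs[-1]:
        return curve[-1][1]
    k = bisect_left(xs, stress)
    (x0, y0), (x1, y1) = curve[k - 1], curve[k]
    return y0 + (y1 - y0) * (stress - x0) / (x1 - x0)


def system_failure_rate(parts, curves):
    """Step 5: add the per-part rates into one system failure rate.

    parts  -- list of (family name, sustained stress ratio), steps 1 and 3
    curves -- mapping from family name to its curve, step 2
    """
    return sum(rate_from_curve(curves[family], stress)
               for family, stress in parts)
```

The summation in the last function is valid only under the exponential failure law assumed in the text, where part failure rates are constant and add directly.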

The process for determining random catastrophic failure rates is
based on the premise that like parts have approximately the same
reliability in one system as in any other system, if they are subjected to
the same stresses. In order to standardize data information and establish a
failure rate data exchange program the Bureau of Naval Weapons instigated
the FARADA Program. The FARADA Program currently represents the most
comprehensive source of failure rate data and has as its objectives the
following: (20)

1. To derive coherent basic failure rates that represent the
composite experience of the various program participants without losing the
identity of the basic input data.

2. To present the failure rate information in a convenient form for
use by design engineers.

3. To convey with each data entry the level of confidence the user
may validly attach to the given failure rate.

4. To extend the present failure rate information to include all
parts for which data are presently available or will be available in the
foreseeable future.

5. To expand and update the current information on the effect of
environmental stress factors on part and component failure rates.

This final objective on the effect of environmental stresses is
probably the most essential item in the prediction of reliability. FARADA
has made it a requirement to furnish the exact environmental conditions
under which a particular set of failure data was determined in order to
correlate results from various sources.

Although some environmental conditions are impossible to completely
simulate, reliable predictions necessitate a study of use conditions.
Some of the many environmental conditions FARADA has been concerned with
are listed here: (20)

1. Percent of Rated:

     Voltage, Frequency, Current, Power

2. Temperature:

     High, Low, Typical

3. Vibration:

     Mechanical: Type and Frequency
     Acoustic: Intensity and Frequency

4. Shock:

     Maximum Intensity, Typical Duration, Frequency of Occurrence

5. Pressure:

     Typical, Range

6. Relative Humidity:

     Typical, Range

7. Radiation:

     Total Absorption, Type

In any complex space electronic system all of the above stresses will
be encountered and must be considered when determining failure rates.

Catastrophic failures can be caused by any of the above stress
factors, but in most well-designed equipments, the principal factors are
electrical and thermal stresses. In the example utilized later in this
report only thermal stresses will be considered.

Failures of electronic parts can be more fully understood by
considering their failure rate density curves.


Failure  Rate  Density  Curve 


The probability density curve as shown in the following figure
consists of three failure stages. (13)

The first of these stages constitutes early failures and begins at
time T = 0. The population will initially have a high failure rate due to
primary material failure or to poor quality control. As these weak
components fail, the failure rate decreases fairly rapidly and this is
known as the "burn-in" or "debugging" period. This early failure
situation is characterized by a conditional failure function which is some
form of the negative exponential distribution. Modern engineering
techniques require a "debugging" or "burn-in" period before the parts are
accepted for equipment assembly. This practice eliminates the
substandard components and is essential in the case of missiles, rockets or
space vehicles where replacement is difficult after launch. Unfortunately,
the cost of complete "debugging" for parts manufacturers is extremely high
and there still exists a resistance to requirements demanded by military
contractors as evidenced by the following statement of Robert C. Sprague,
Chairman of the Board of Sprague Electric Company: (23)

     Consider for example, the problems that will be encountered
     by the parts manufacturer when 10% to 20% of life
     test samples must be tested for over 10,000 hours. The
     management problems associated with test equipment, personnel,
     and data recording increase many fold over present
     requirements. For the components supplier, the cost of qualifying
     his product to one of the several important specifications
     systems for highly reliable parts has been estimated to run
     between $230,000 and $500,000 which I think is a conservative
     estimate.

After time $T_2$ on the failure probability density curve the period
of "wear-out" failures begins. Failures taking place during this period
are generally caused by material or dimensional changes due to fatigue,
material migration, chemical reactions, and other similar phenomena.

According to Buckland, (6) the failure frequency distribution during
wear-out is represented most often by:

1. The Weibull family of distributions.

2. Gamma distribution.

3. Normal or Gaussian distributions.

The chart on the following page compares these distributions.

Again accepted reliability concepts require replacement of parts
before the wearout period begins. However, if we had a large system with
many components in series and chance failures are absent so that only
wearout failures occur, a constant failure rate will evolve and the system
will behave exponentially. As a general rule wearout failures must be
prevented by early replacement of each part with a part free of early
failures in order to attain high system reliability.

The stage of failures from $T_1$ to $T_2$ will be the primary area of
consideration in this report and is generally assumed to represent the
life of the part. To better understand this stage of the curve,
reliability functions and failure patterns will be discussed.

A reliability function is defined as a mathematical formula relating
the probability that the system will operate satisfactorily to a
specific period of time. The nature of this relationship is dependent
on the distribution of times to failure of a particular part and
theoretically could be of any form. However, most sources agree that failure
patterns can be represented by a relatively small number of distribution
types. The types most commonly encountered are (1) the normal or
Gaussian, and (2) the exponential which is a special case of (3) the
Weibull. (26) p. 137.
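The statement that the exponential is a special case of the Weibull can be checked numerically. The sketch below assumes the common two-parameter Weibull form with a scale and a shape parameter; that parameterization and the function names are assumptions made for illustration, since the report does not write the Weibull out.

```python
import math


def weibull_reliability(t, scale, shape):
    """R(t) = exp(-(t/scale)**shape), assumed two-parameter Weibull form."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def exponential_reliability(t, rate):
    """R(t) = exp(-rate * t) for a constant failure rate."""
    return math.exp(-rate * t)
```

With shape equal to 1 and scale equal to the reciprocal of the failure rate, the Weibull reliability coincides with the exponential one and the hazard is constant in time. A shape greater than 1 gives a hazard that rises with time, which is the wear-out behavior described earlier.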

The exponential distribution will be utilized in the example in this
report as it is in general acceptance as indicated in the AGREE Report:
"Field measurements of military and commercial electronic systems have
demonstrated that in general the rate of system failure is fairly
constant throughout the life of the system."

This means that the observation of a large population of systems
has shown that any system chosen at random can be expected to operate
satisfactorily approximately the same length of time after being
repaired as it operated before failing. The AGREE Report also defines the
life of a system as the period during which it fails at a constant rate.
If systems in general fail at a constant rate then the probability density
function (or failure frequency function) which accurately describes a
system's performance is the negative exponential. (1), p. 79.

$$f(t) = \lambda \exp(-\lambda t)$$

The failure distribution function resulting from $f(t)$ is:

$$F(t) = \int_0^t f(\tau)\,d\tau = 1 - \exp(-\lambda t)$$

Consequently, the reliability function is:

$$R(t) = \int_t^\infty f(\tau)\,d\tau = \exp(-\lambda t)$$

The exponential density function like other statistical density
functions has a characteristic value called the mean. This is obtained
for all distributions by forming what is called the first moment $t \cdot f(t)$
of the density function and integrating over the range of $f(t)$. This
operation on the exponential density function determines the mean of the
function or the mean time between failures.

$$\theta = \int_0^\infty t f(t)\,dt = \int_0^\infty t \lambda \exp(-\lambda t)\,dt = \frac{1}{\lambda}$$

Thus, in the exponential case, the mean time between failures is equal
to the reciprocal value of the failure rate $\lambda$. Other, nonexponential
density functions also have mean time between failures, but they are not
the reciprocals of the failure rates.
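A short numerical check of these relations may help. The sketch below codes the exponential density, failure distribution, and reliability function, and approximates the first-moment integral with a midpoint rule. The function names, the step count, and the truncation of the integral at 40 mean lives are choices made for this illustration only.

```python
import math


def density(t, lam):
    """f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)


def failure_distribution(t, lam):
    """F(t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)


def reliability(t, lam):
    """R(t) = exp(-lam * t)."""
    return math.exp(-lam * t)


def mean_by_integration(lam, steps=100_000):
    """Midpoint-rule estimate of the integral of t * f(t) dt.

    The upper limit is truncated at 40 / lam, beyond which the
    remaining tail of the integrand is negligible.
    """
    upper = 40.0 / lam
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t * density(t, lam)
    return total * h
```

For a failure rate of 0.01 per hour the numerical mean comes out very close to 100 hours, in agreement with the reciprocal rule, and F(t) and R(t) add to one at every t.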

The density function can be used to determine an important measure
of the performance of an equipment called the mean-time-between-failures
(MTBF). Since MTBF is an area of primary concern in reliability
prediction its derivation and usage will be considered here.

Determination of MTBF

There is no statistical relationship between the MTBF and life of
an equipment or part. Many parts have become so reliable that they may
be considered to have an infinite MTBF and yet a comparatively short life.
Others can be considered to have an infinite life and a comparatively short
MTBF. This MTBF can be determined for a system in either of the
following two ways. (13)

Method 1

A calculation is performed, based on the summation of the predicted part
failure rates. If the individual part failure rates are expressed in
percent per 1,000 hours and are signified by $\lambda_1, \lambda_2, \ldots, \lambda_n$,
then the MTBF is determined as follows:

$$\theta = \frac{10^5}{\lambda_1 + \lambda_2 + \cdots + \lambda_n}$$
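The factor of 10^5 comes from the units: one percent per 1,000 hours is 0.01 / 1000 = 10^-5 failures per hour. A minimal sketch of the calculation follows; the function name and the sample rates in the usage note are hypothetical.

```python
def mtbf_from_percent_rates(rates_percent_per_1000h):
    """MTBF in hours from part failure rates given in percent per 1,000 hours.

    Each rate is converted implicitly by the 1e5 factor:
    1 percent per 1,000 hours equals 1e-5 failures per hour.
    """
    total = sum(rates_percent_per_1000h)
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0e5 / total
```

For three parts with assumed rates of 0.5, 1.5 and 2.0 percent per 1,000 hours, the sum is 4.0 and the predicted MTBF is 25,000 hours.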

Method 2

Observations are made of the system's performance during actual operation
either in the field or with tests that simulate actual field conditions.

$$\hat{\theta} = \frac{\sum_{i=1}^{n} T_i + \sum_{j=1}^{m} T_j}{f}$$

where $\hat{\theta}$ is the best estimator of MTBF, $T_i$ is the time duration in hours
of the ith observation of n total observations each of which terminated
with a failure, $T_j$ is the time duration in hours of the jth observation
of m total observations each of which terminated prior to the occurrence
of a failure, and f is the total number of failures. An example of this
method is shown below: (13)


UNIT      HOURS OF       NUMBER OF      HOURS NOT ENDING
          FAILURE Ti     FAILURES f     IN FAILURE Tj

1         25             1              71
2         60             1              40
3         15             1              75
[entries for the remaining units are illegible]

TOTALS    255            5              245

MTBF:  $\hat{\theta} = (255 + 245) / 5 = 100$ hours
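The Method 2 estimator can be written as a small function. The sketch below applies it to the column totals of the example; the total of 245 hours not ending in failure is taken here as the value implied by the stated result of 100 hours with 5 failures, which is an assumption because that entry is hard to read in the source. The function name is hypothetical.

```python
def observed_mtbf(failure_hours, censored_hours, failures):
    """Estimate MTBF as total observed hours divided by number of failures.

    failure_hours  -- durations of observations that ended in a failure (Ti)
    censored_hours -- durations of observations that ended without failure (Tj)
    failures       -- total number of failures f
    """
    if failures <= 0:
        raise ValueError("at least one failure is needed for an estimate")
    return (sum(failure_hours) + sum(censored_hours)) / failures


# Column totals from the example: 255 hours ending in failure,
# 245 hours (assumed reading) not ending in failure, 5 failures.
example_mtbf = observed_mtbf([255], [245], 5)
```

Note that the hours from observations that did not end in failure still count toward the numerator. Leaving them out would understate the MTBF.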

Only the first of these two methods can be used to predict the
reliability of a system before the module is actually constructed. In order
to utilize this first method, failure rates of component parts must be
determined either from testing or from appropriate tables. With the number
of hours an equipment is expected to operate between failures becoming
increasingly large, it will be necessary to determine MTBF in terms of days
and years rather than hours. For the space program to be effective, it
is essential to be able to predict MTBF or the reliability of a system
without relying on life testing.

RELIABILITY PREDICTION

Reliability prediction is a method of forecasting the probable
reliability of a device by means of past experience or statistical
methods. The primary reason for predicting reliability is to determine
whether the design of a system is sufficiently mature to ensure mission
success. However, because of economic reasons, it is essential to
work toward an optimum reliability. The old concept of putting a system
into operation in order to test its reliability becomes economically
prohibitive with the complexity of today's systems.

In using statistical methods to predict reliability, it must be
realized that they are not used to refine proposed designs of unknown
quality, but to establish on a probabilistic basis what is known about
performance characteristics. A given system's reliability is predictable
only in the sense of performance limits within which it will function with
a pre-assigned level of probability for doing so. Before selecting a
particular method to be used for predicting reliability, an analysis
should be made using the following criteria: (12)

1. Project Requirements: Does the project require that a specific
technique be employed?

2. Purpose of Prediction: Reliability predictions may be used to
establish adequacy of proposed designs, to measure compliance with
reliability specifications, and to analyze design improvements. The first
two require a high degree of accuracy whereas evaluation of alternate
designs can utilize a simplified method.

3. The Type of Equipment in a System: Several techniques relating
to different types of equipment are available. Switching-circuit analogy,
redundancy techniques, and situations where the results of different type
failures are important, must be considered.

4. Phase of Design: The phase of the design process determines the
amount of detail information available about the equipment.

5. Degree of Accuracy Desired: The refinement of a prediction to
include confidence limits associated with estimates and variations in
operational requirements necessitates more advanced prediction techniques.

As a result of the above analysis it is possible to determine the
technique most applicable to meet any requirements.

Some of the techniques available to the engineer are the following: (12)

1. The technique based on the product rule and simple redundancy
considerations. This procedure is valid where parts composing a system
or sub-systems within an equipment operate in a simple series or
redundant configuration.

2. Another approach is prediction by equipment function. This
technique involves comparing the system or its parts with that of existing
devices with known reliability.

3. A technique useful in the early stages of design of electronic
equipment is the active-element-group (AEG) concept. By definition, an
AEG consists of a tube or a transistor with a proportionate share of
the resistors, capacitors, coils, transformers, and other parts which
form the module. Failure rate of the new system is predicted by
determining the sum of the products of the number of AEG's times their
failure rates.

4. A fourth technique, sometimes termed "Cause and Effect Analysis",
is more qualitative than quantitative. The application of this technique
requires a detailed systematic analysis of the relationship of various
parts to the whole; identification of modes of failure and the effects
of such failures; and analysis of means of eliminating failures.

Regardless of the technique selected, there are certain procedures
that should be followed in predicting reliability. The steps listed by
ARINC Research Corporation are probably the most comprehensive and will
be explained here: (26)

1. Define the System: The task of defining the system consists of
describing functions and limits of parts or sub-systems.

2. Define Failure: Normally failure is described as any condition
which renders the system incapable of operating within its specified
performance parameter limits. Any other concept of failure should be
labeled as such.

3. Define Operating and Maintenance Conditions: Operating conditions
include the environmental conditions prevailing during various periods of
operation. This includes a concept of duty cycles which has become of
importance in recent years. There is evidence of a requirement for
failure rates during off-duty time as noted in the following quote: (16), p. 86
"The traditional practice of using component part failure rates derived
from field data to predict the reliability of future equipments without
any regard to duty cycle can result in erroneous predictions." The same
results were noted in a study by IBM. The IBM Space Guidance Center had
analyzed the reliability of more than 100 transistorized military guidance
computers over a period of two years. Analysis of two sub-groups of
computers showed that the sub-group that had experienced 300 percent
more "on time" had operated for 170 percent longer time between failures.
Maintenance conditions become important for determining replacement
schedules and preventive maintenance times.

4. Construct Reliability Block Diagram: Several block diagrams
might be required to separate the system into sub-systems or even parts
depending on the complexity of the system. Primary consideration should
be given to arranging blocks with regard to redundancy, duty cycles,
separate failure rates, and parallel or series circuitry. If system
operations or environments vary during a particular mission, this must
be constructed as a separate block diagram.

5. Develop Reliability Formulas: Some examples of basic formulas
for computing reliability are considered here: (1), p. 85-120.

a. If a component has a reliability $R_1$ and another has a
reliability $R_2$, then the probability that both will be operating at time
t is:

$$R_s(t) = R_1(t) \cdot R_2(t)$$

In the exponential case with constant failure rate this becomes:

$$R_s(t) = \exp\left[-\int_0^t \lambda_1\,dt\right] \cdot \exp\left[-\int_0^t \lambda_2\,dt\right]$$

or

$$R_s(t) = \exp[-(\lambda_1 + \lambda_2)t]$$

For n sub-systems in series with failure rates equal to $\lambda_i$ the system's
reliability becomes:

$$R_s(t) = \exp\left(-\sum_{i=1}^{n} \lambda_i t\right)$$

The probability that either one or both of the components will survive is

$$R_p(t) = R_1(t) + R_2(t) - R_1(t) \cdot R_2(t)$$

and again using the exponential case with constant failure rates the
equation becomes:

$$R_p(t) = \exp(-\lambda_1 t) + \exp(-\lambda_2 t) - \exp[-(\lambda_1 + \lambda_2)t]$$

b. The reliability of n components in parallel is given below with
$Q_i(t)$ meaning the unreliability:

$$R_p(t) = 1 - \prod_{i=1}^{n} Q_i(t)$$

If the components in parallel have equal failure rates, which is very
often true, the equation simplifies to:

$$R_p(t) = 1 - Q^n = 1 - [1 - \exp(-\lambda t)]^n$$

where Q is the unreliability of one component and n = number of components.
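The series and parallel formulas of items a and b translate directly into code. The sketch below assumes constant failure rates and statistically independent components, as the text does; the function names and the sample rates used afterwards are hypothetical.

```python
import math


def series_reliability(rates, t):
    """All components must survive: exp(-(sum of failure rates) * t)."""
    return math.exp(-sum(rates) * t)


def parallel_reliability(rates, t):
    """At least one component must survive: 1 - product of unreliabilities."""
    all_fail = 1.0
    for lam in rates:
        all_fail *= 1.0 - math.exp(-lam * t)  # Q_i(t) for this component
    return 1.0 - all_fail
```

For two components the parallel function reproduces the expanded three-term expression above. With rates of 0.001 and 0.002 per hour over 100 hours, the series arrangement gives exp(-0.3) while the parallel arrangement gives exp(-0.1) + exp(-0.2) - exp(-0.3), so the same two parts are considerably more reliable in parallel than in series.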

c. Another consideration is the reliability of stand-by systems.
For a stand-by system of three units with the same failure rate, where
one is operating and the other two are standing by, the reliability
formula is:

$$R_{sb}(t) = \exp(-\lambda t) \cdot \left(1 + \lambda t + \frac{(\lambda t)^2}{2!}\right)$$

In general with n equal components standing by the formula is:

$$R_{sb}(t) = \exp(-\lambda t) \cdot \left(1 + \lambda t + \frac{(\lambda t)^2}{2!} + \cdots + \frac{(\lambda t)^n}{n!}\right)$$

The advantage of stand-by arrangements results not from a significant
increase in reliability, but in a considerably longer MTBF. In the
formulas above the reliability of switching devices was considered to
be 100 percent; however, if switches have other than 100 percent
reliability they must be included as indicated below with one stand-by
system backing up an operating system. $R_{ss}$ = Switching Reliability

$$R_{sb}(t) = \exp(-\lambda t) + R_{ss} \exp(-\lambda t)\,\lambda t$$
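The stand-by formulas of item c can be sketched as follows, assuming identical units with a constant failure rate and no failures while idle. The function names are hypothetical, and the switch is modeled only through a single success probability, as in the text.

```python
import math


def standby_reliability(lam, t, spares):
    """One operating unit backed by `spares` identical idle units.

    Sums the terms (lam*t)**k / k! for k = 0..spares, times exp(-lam*t).
    Assumes perfect switching and no failures while idle.
    """
    x = lam * t
    terms = sum(x ** k / math.factorial(k) for k in range(spares + 1))
    return math.exp(-x) * terms


def standby_with_switch(lam, t, switch_reliability):
    """One operating unit and one spare with an imperfect switch."""
    x = lam * t
    return math.exp(-x) + switch_reliability * math.exp(-x) * x
```

With a switch reliability of 1 the second function agrees with the first for a single spare, and with a switch reliability of 0 it falls back to the reliability of the operating unit alone.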

d. A final consideration can be utilized whether or not the
components are equal or fail exponentially. This consists of first
deriving the density function of systems in stand-by, and then obtaining
the cumulative reliability of the system by the integration of the density
function.

e. All of the above stand-by formulas assume that the stand-by
unit does not fail while not in operation, but as was indicated earlier
in this report, off duty time can contribute significantly to failure of
an equipment. For this type of situation, two different failure rates
for stand-by systems must be considered as follows:

$$R_{sb}(t) = \exp(-\lambda_1 t) + \frac{\lambda_1}{\lambda_1 + \lambda_3 - \lambda_2}\left[\exp(-\lambda_2 t) - \exp(-(\lambda_1 + \lambda_3)t)\right]$$

The operating component has a failure rate $\lambda_1$; the stand-by component
has an operating failure rate of $\lambda_2$ and an idle failure rate of $\lambda_3$.
Switching reliability is assumed to be unity.

f. Not all reliability problems can be reduced to the simple
formulas considered here. There are many new techniques being developed in
an effort to obtain valid reliability predictions in complex situations.
Some of these techniques include Monte Carlo methods, linear programming,
queuing theory, Bayes theorem, and various distribution theories.
Markovian techniques, where failure rates change with time, can be used
to consider the effects of component drift and catastrophic failure. (18)

Other advanced analysis procedures are being developed, but the
underlying problem remains in construction of the mathematical model.
The effects of interactions of the outputs of various parts and of their
assemblies into sub-systems, may be estimated from the design, or from
related test data.
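As an illustration of the Monte Carlo methods mentioned in item f, the sketch below simulates the stand-by arrangement of item e and compares the result with a closed-form expression. The closed form coded here is the standard result for one operating unit (rate `lam1`) backed by one spare that fails at rate `lam3` while idle and at rate `lam2` once switched in, with perfect switching. The function names, trial count and seed are choices made for this example, and the closed form requires `lam1 + lam3` to differ from `lam2`.

```python
import math
import random


def standby_idle_closed_form(lam1, lam2, lam3, t):
    """Closed-form reliability with an idle failure rate for the spare."""
    coeff = lam1 / (lam1 + lam3 - lam2)
    return math.exp(-lam1 * t) + coeff * (
        math.exp(-lam2 * t) - math.exp(-(lam1 + lam3) * t)
    )


def standby_idle_monte_carlo(lam1, lam2, lam3, t, trials=100_000, seed=1):
    """Estimate the same reliability by simulating random lifetimes."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        primary_life = rng.expovariate(lam1)
        if primary_life >= t:
            survived += 1          # operating unit lasted the whole mission
            continue
        if rng.expovariate(lam3) < primary_life:
            continue               # spare failed while idle, system is down
        # Spare takes over at primary_life and must last until t.
        if primary_life + rng.expovariate(lam2) >= t:
            survived += 1
    return survived / trials
```

With rates of 0.01, 0.01 and 0.002 per hour and a 100 hour mission the closed form gives about 0.70, and the simulated fraction of surviving systems should agree with it to within sampling error. As the idle failure rate approaches zero the closed form approaches the ordinary one-spare stand-by result exp(-λt)(1 + λt).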
6. Compile Parts Lists: For each block on the reliability block
diagram, individual parts should be listed in some convenient order.
Parts lists should include part descriptions, pertinent ratings, and space
for entering operating voltages, currents, power dissipation, stress
indices, and failure rates.

7. Perform Stress Analysis: In a reliability stress analysis,
operational parameters such as power, voltage, current, horsepower, system
pressure, flow rate, etc., or environmental parameters such as temperature,
altitude, humidity, vibration, radiation, etc., are plotted against
failure rates. In this report the environmental parameter of ambient
temperature plotted against part failure rate will constitute the primary
stress analysis. In many instances, operational or environmental
parameters are plotted against "application factors" or "operational
multipliers." The product of these multipliers and the basic failure rate
determines the gross failure rate under particular environmental stresses.
Military Standardization Handbook 217 (21) describes in great detail the
methods for making stress analysis on electron tubes, semiconductor
devices, resistors, capacitors, transformers, inductors, coils, relays,
switches and other parts. In using multipliers or correction factors,
the failure rate equation normally takes on the following form:

$$\lambda = \lambda_b \, K_1 K_2 K_3 K_4 K_5 K_6$$

where $\lambda$ is the adjusted failure rate;

$\lambda_b$ is the basic, or standard, failure rate;

$K_1$ corrects for applied stresses;

$K_2$ relates the proportion of likely tolerance failures to
random catastrophic failures;

$K_3$ adjusts for changes in external environments;

$K_4$ is a possible adjustment required to account for different
maintenance practices which can have an effect on observed system
failures;

$K_5$ denotes system complexity: the more complex the system, the
greater will be its failure rate;

$K_6$ accounts for observed cycling effects.

Some typical environmental multipliers recommended by Military
Standardization Handbook 756 are: (26) p. 318.

    Environment                          Multiplier
    Shipborne/Fixed Ground                   1.0
    Aircraft                                 6.5
    Missiles                                80.0
    Satellite: Launch, Boost Phase          80.0
    Satellite: Orbit Phase                   1.0
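
As a rough illustration of how an environmental multiplier is applied, the following Python sketch scales a basic failure rate by the multipliers listed above. The basic rate of 0.05 percent per 1,000 hours and the names `ENV_MULTIPLIER` and `adjusted_failure_rate` are illustrative assumptions, not values or terms taken from the handbook.

```python
# Environmental multipliers as listed in the text (Handbook 756 values).
ENV_MULTIPLIER = {
    "shipborne_or_fixed_ground": 1.0,
    "aircraft": 6.5,
    "missile": 80.0,
    "satellite_launch_boost": 80.0,
    "satellite_orbit": 1.0,
}


def adjusted_failure_rate(base_rate, environment):
    """Scale a basic failure rate by the multiplier for one environment."""
    return base_rate * ENV_MULTIPLIER[environment]


# Hypothetical basic failure rate, in percent failures per 1,000 hours.
base = 0.05
for env in ENV_MULTIPLIER:
    print(f"{env:28s} {adjusted_failure_rate(base, env):8.3f}")
```

The same part is thus charged a much higher failure rate during a boost phase than in orbit, which is why the stress indices are assigned separately for each mission phase.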

8. Assign Part Failure Rates or Probabilities of Survival: Failure
rates will be extracted from dependable data sources such as FARADA or
Military Standardization Handbooks. The stress indices determined in
the preceding step will be applied and, if they vary during different
phases of the mission, will be assigned separately.

9. Combine Part Failure Rates or Probabilities of Survival to Obtain
Block Failure Rates or Reliabilities: The mathematical models
previously determined are used to combine failure rates into resultant
block failure rates. These failure rates are modified to account for
tolerance failures and use conditions.

10. Compute System Reliability: System reliability is computed by
entering the block reliabilities and failure rates in the system
reliability formula and solving for time periods or mission phases of
interest. Reliability estimates for the various mission phases should be
combined to show system reliability for the entire mission.
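
Steps 8 through 10 can be sketched in Python under simplifying assumptions that the report does not impose in general: every part fails exponentially, all blocks are in series, and failures are independent. The block failure rates, the phase multipliers and durations, and all function names below are hypothetical.

```python
import math


def block_failure_rate(part_rates):
    """Series block: part failure rates (failures per hour) simply add."""
    return sum(part_rates)


def phase_reliability(block_rates, multiplier, hours):
    """Probability that every block survives one mission phase."""
    system_rate = multiplier * sum(block_rates)
    return math.exp(-system_rate * hours)


def mission_reliability(block_rates, phases):
    """Multiply phase reliabilities; phases is a list of (multiplier, hours)."""
    result = 1.0
    for multiplier, hours in phases:
        result *= phase_reliability(block_rates, multiplier, hours)
    return result


# Hypothetical blocks and mission profile, for illustration only.
blocks = [block_failure_rate([2e-6, 5e-6]), block_failure_rate([1e-5])]
phases = [(80.0, 0.1), (1.0, 100.0)]  # e.g. a boost phase, then an orbit phase
print(round(mission_reliability(blocks, phases), 6))
```

Redundant or stand-by blocks would need their own reliability expressions in place of the simple sum used here.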
Confidence on MTBF Prediction

Frequently the parameter of interest is the Mean-Time-Between-Failures
(MTBF), and this can be calculated by evaluating the integral
of the reliability function from 0 to infinity if the system component
failures remain exponential. In a nonexponential system, for example
redundant systems or systems where wearouts occur, MTBF becomes a function
of replacement time (T). An interesting correlation of predicted and
actual MTBF's can be seen in the following table: (21) p. 288.

COMPARISON OF OBSERVED AND CALCULATED RELIABILITIES

    Equipment                Calculated     Observed       Observed MTBF
                             MTBF           MTBF           90% CI Estimate
                             (hours)        (hours)        (hours)

    AIRBORNE
    Weather Radar            366            350            258-508
    Fire Control             214            (illegible)    106-206
    Communications           125            (illegible)    112-135

    GROUND
    Search Radar             7?             (illegible)    58-70
    Radar Identification     425            339            278-425
    Communications           399            (illegible)    374-525
    Fire Control Display     95             84             66-111
    Designation Display      183            185            132-181

The intervals of MTBF values given in the last column include the true
(but unknown) MTBF 90% of the time for each equipment. These intervals
were calculated in accordance with the relationships:

$$\text{Lower Limit} = \frac{2r\,(\text{Obs. MTBF})}{\chi^2_{\alpha/2;\,2r}} \qquad \text{Upper Limit} = \frac{2r\,(\text{Obs. MTBF})}{\chi^2_{1-\alpha/2;\,2r}}$$

where r = number of failures observed.

Denominators = values from the Chi-square distribution table for an
alpha = 10%, the first subscript being the probability of exceeding the
tabulated value; 2r is the number of degrees of freedom.

The observed MTBF was calculated by summing all the operating time
accumulated by all the components during the test, and dividing by the
number of failures. It should be noted that only two of the predicted
values fall outside of the 90 percent CI on the actual test. Both of
these estimates were within 8 hours of the estimated CI.
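
The interval calculation can be sketched in Python using only the standard library. Because the number of degrees of freedom 2r is always even, the chi-square tail probability can be written as a finite Poisson sum and inverted by bisection. The function names are hypothetical, and the failure count r = 60 in the usage line is an assumption chosen for illustration; the report does not state how many failures were observed.

```python
import math


def chi2_upper_tail(x, dof):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)  # k = 0 term of the Poisson sum
    total = term
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return total


def chi2_point(upper_tail_prob, dof):
    """Value exceeded with the given probability, found by bisection."""
    lo, hi = 0.0, float(dof)
    while chi2_upper_tail(hi, dof) > upper_tail_prob:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, dof) > upper_tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mtbf_interval(observed_mtbf, failures, alpha=0.10):
    """Two-sided confidence limits on MTBF for a test ended at the r-th failure."""
    dof = 2 * failures
    lower = dof * observed_mtbf / chi2_point(alpha / 2.0, dof)
    upper = dof * observed_mtbf / chi2_point(1.0 - alpha / 2.0, dof)
    return lower, upper


# Observed MTBF of 339 hours with an assumed (hypothetical) 60 failures.
print(mtbf_interval(339.0, 60))
```

With these inputs the limits come out near 278 and 425 hours, which is the interval tabulated for the ground radar identification equipment; this agreement is only suggestive, since the actual failure count is not given.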

When it is required that a particular system have an MTBF which
exceeds a specified minimum value with a probability of (1 - α), i.e.,
at a confidence level of 100(1 - α) percent, a one-sided Chi-square test
is used. It must be proven that: (1), p. 236.

$$\theta \ge C = \frac{2T}{\chi^2_{\alpha;\,2r}}$$

or, that in an accumulated test time of

$$T = \frac{C\,\chi^2_{\alpha;\,2r}}{2}$$

not more than r failures have occurred. Any integer can be chosen for r.
Utilizing this information and assuming that the wearout period is
normally distributed, it is possible to estimate when the wearout period
begins. Reliability is increased and wearouts are eliminated if component
replacement or overhaul time is thus established. For a test truncated
after a particular test time the one-sided limit on the estimate of MTBF
becomes:

$$\theta \ge \frac{2T}{\chi^2_{\alpha;\,2r+2}}$$

Student's t distribution can be used to determine the limits if a sample
size of under 25 is desired in the test.
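
A short Python sketch of the one-sided test follows. It assumes the relation T = C χ²(α; 2r)/2 for the accumulated test time and, for a time-truncated test, the standard lower limit with 2r + 2 degrees of freedom. The function names and the 1,000-hour requirement are hypothetical.

```python
import math


def chi2_upper_tail(x, dof):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return total


def chi2_point(upper_tail_prob, dof):
    """Value exceeded with the given probability, found by bisection."""
    lo, hi = 0.0, float(dof)
    while chi2_upper_tail(hi, dof) > upper_tail_prob:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, dof) > upper_tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def required_test_time(min_mtbf, failures, alpha):
    """Accumulated test time T = C * chi2(alpha; 2r) / 2."""
    return min_mtbf * chi2_point(alpha, 2 * failures) / 2.0


def lower_limit(test_time, failures, alpha, time_truncated=False):
    """One-sided lower confidence limit on MTBF from accumulated test time."""
    dof = 2 * failures + (2 if time_truncated else 0)
    return 2.0 * test_time / chi2_point(alpha, dof)


# Hypothetical requirement: MTBF of at least 1,000 hours at 90 percent confidence.
for r in (1, 2, 5):
    print(r, round(required_test_time(1000.0, r, 0.10), 1))
```

Allowing more failures lengthens the required test, and the limit for a time-truncated test is always lower than the limit for a test that ends at a failure with the same accumulated time.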

Truncated testing and small samples may be used to determine estimates
of the MTBF, but this causes a fairly large confidence interval as
indicated. Another possibility to consider would be a means of accelerating
failures of components.

ACCELERATED TESTING

The "race for space" has created a need for refining the process
known as accelerated testing. Earlier in the report failure rate data
was considered the primary source of information for determining
component reliability and consequently system reliability. Because of
the complexity of systems and changes in design it is often necessary to
develop data through testing of devices or systems. Here is an excerpt
from a report on the testing being done at General Dynamics, Ft. Worth
and its parts vendors to insure high reliability on the military's new
variable wing aircraft (F-111). (24)

"The new '(illegible) performance specs' (Military
Standardization Handbook, number and page illegible) are based on a
concept of 100 percent testing of devices at full-rated power
or above for long enough to weed out any failures. Officials
at General Dynamics, Ft. Worth are also studying the
feasibility of specifying integrated circuit functions rather
than measuring characteristics of individual components.
Some electronic sub-systems used in the F-111 avionics have
been on continuous operating test for more than a year
without a single component failure."

Two problems become apparent in view of the procedures used at General
Dynamics. The first of these is that specs for integrated circuits or
sub-systems have not yet been developed. The second is that as
reliability is improved testing becomes an increasingly time consuming
process. Mean time between failures must eventually reach the point where
the cost of lengthy testing becomes prohibitive. With trips to near
planets predicted for the next decade, mean time between failures must be
computed in terms of years rather than hours. In order to arrive at
reliable failure rate data, or to predict MTBF with any degree of
confidence, life testing would require placing the system in operation
for a period equivalent to the duration of interest. A possible solution
to this problem is subjecting electronic sub-systems to accelerated
testing techniques.

One form of accelerated testing employs high stresses to accelerate
failure. By applying a factor which relates stress and lifetime,
it can be determined in a short period under high stresses what may be
expected to happen in a much longer time under normal stresses. Some of
the problems encountered with this type of testing include:

1. The correctness of the stress-life relation for the components
or sub-systems.

2. The equivalence of the unit tested to the unit actually used.

3. The correct evaluation of effects of simultaneously combined
stresses in their related magnitudes.

4. The size of the sample tested and the variability of the results.

Another form of accelerated testing requires testing an increased
number of components in order to obtain in a short time a large number
of component operating hours. The difficulty encountered here is that
inferences about life distributions must be made from truncated samples,
which often can lead to erroneous predictions.

The use of accelerated tests is often proposed to reduce testing
time and testing costs. The procedure is to determine failure rates
under high-stress conditions and extrapolate the results to give an
estimate of anticipated failure rates under use conditions. Studies are
now being carried on by various manufacturers, and experience indicates
a high degree of difficulty in obtaining precise acceleration factors for
most electronic parts. (19)


One of the most thorough studies run so far in the field of
accelerated testing techniques was accomplished by the General Electric
Company on Contract AF 30(602)-3415. (27) The purpose of that program
was to study, investigate and develop the testing and measurement
techniques for controllable accelerating of electronic parts aging; and to
perform an investigation, study and analysis of the failure mechanisms of
the high reliability parts used. In the test, externally produced
thermal energy was selected as the degradation stress for resistors and
semiconductors. This selection was based upon the fact that changes in the
electrical properties of these types of electronic parts, with respect
to time, are the result of chemical and physical changes. These reactions
usually are accelerable by application of thermal energy. Since thermal
energy was selected as the degradation inducing stress for the majority
of the tests, it was necessary to precisely control and measure the
critical element temperature. This was done by using ovens with a ±2° C
temperature control, and the use of a nitrogen atmosphere to retard
oxidation of the leads. An upper limit of 350° C for resistors and 300° C
for semiconductors was set because changes in failure mechanism were
observed at higher temperatures. Average rates at which the parts failed
were determined by a series of accelerated tests-to-failure. These results
were used to establish life characteristics with statistical confidence
limits. One of the methods for establishing life characteristics is the
step stress technique.

Step  Stress  Technique 
The step stress technique for accelerated testing consists of
considering stress as the independent variable and some function of
damage as the dependent variable of deterioration. The procedure is to
start at a stress level where deterioration is known not to be
significant and to increase stress in increments until deterioration,
observed in terms of damage, becomes significant.

The accelerated step stress tests for the semiconductors in the
General Electric study indicated an excellent means of determining long
life capability of parts. Using step stresses of 24 hour and 120 hour
duration and samples of 50 R200^5 diodes, a plot of junction temperature
versus the normal probability scale was made as indicated below:
(27) p. 2.29.


[Figure: junction temperature (1/T absolute scale) versus normal
probability scale]

Selecting the 1 percent and 50 percent failure points for each phase
test, these points were replotted on a junction temperature versus log
time scale. Linear extrapolation shows a 1 percent failure occurring at
about 8000 hours. The overall study concluded that a definite correlation
could be established between accelerated tests and life tests provided
a damage parameter can be determined for parts which will be linear when
plotted against degrees centigrade on a 1/T Kelvin scale.
The above information can best be illustrated by the following table:
(27) p. 2.29.


[Figure: 50% failure. Junction temperature in degrees centigrade (1/T
absolute scale, 100 to 375) versus time in hours (log scale, 10 to
10,000)]

Assuming an exponential reliability function, an acceleration factor
can be established from the following:

$$A = \frac{\lambda(T_1)}{\lambda(T_0)}$$

where $\lambda(T_0)$ = failure rate at use temperature

and $\lambda(T_1)$ = failure rate at accelerated temperature

"There is much evidence (11) to conclude that $\ln \lambda(T)$ is a
linear function of the reciprocal of the operating
temperature T in degrees Kelvin, therefore,

$$\ln \lambda(T) = a - \frac{b}{T}$$

or

$$\ln \lambda(T_0) - \ln \lambda(T_1) = b\left(\frac{1}{T_1} - \frac{1}{T_0}\right)$$

Therefore, if the degradation rate (b) can be
estimated from experimental data the acceleration factor
(A) is easily determined.

For the Weibull distribution it can be proven that
the acceleration factor is:"

$$A = \exp\left[\frac{b}{\beta}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right]$$

and it is independent of time. Again if the degradation rate (b) can
be determined then (A) can be found. Proof of the derivation of this
can be found in RADC-TDR-64-142. The author assumes that $\beta$ is
relatively independent of temperature. (1)
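
A minimal Python sketch of this calculation follows, assuming the relation ln λ(T) = a - b/T with b positive and the acceleration factor defined as the ratio of the failure rate at the test temperature to that at the use temperature. The two measured failure rates, the temperatures, and the function names are hypothetical.

```python
import math


def fit_b(rate_1, temp_1_k, rate_2, temp_2_k):
    """Solve ln(rate) = a - b/T for b from two (rate, temperature) points."""
    return math.log(rate_2 / rate_1) / (1.0 / temp_1_k - 1.0 / temp_2_k)


def acceleration_factor(b, use_temp_k, test_temp_k):
    """Failure rate at the test temperature divided by that at the use temperature."""
    return math.exp(b * (1.0 / use_temp_k - 1.0 / test_temp_k))


# Hypothetical accelerated-test results: failure rates in percent per
# 1,000 hours measured at 200 C and 250 C (473.15 K and 523.15 K).
b = fit_b(0.5, 473.15, 2.0, 523.15)

# Extrapolate to an assumed use temperature of 55 C (328.15 K).
print(round(b, 1), round(acceleration_factor(b, 328.15, 523.15), 1))
```

The extrapolation is only as good as the assumption that the same failure mechanism, and therefore the same value of b, applies over the whole temperature range.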

Matrix  Testing 

Another convenient method of determining acceleration factors is
described by P. H. Greer (11). Greer uses a "matrix" type experiment
in which combinations of environmental conditions such as temperature,
voltage, and power are used to stress the units. The proportions
defective under each condition of stress are estimated, and regression
techniques are used to estimate reliability factors over a large number of
stress conditions.

A matrix test which was developed by Motorola as a part of the
Minuteman Transistor Reliability Improvement Program is shown below.
The program consisted of testing a number of devices at several
combinations of ambient or case temperatures, and percentages of rated
power, to accelerate the potential device failure mechanisms. A total
of 9,675 devices were tested at power levels from 0 to 133 percent of
rated dissipation and under eight ambient temperature conditions for
4,000 hours.

Failure rates were plotted against junction temperature on a 1/°K
scale and the plots formed a straight line. Higher failure rates occurred
for the 15V test than for the 5V test, indicating that this type of
device is affected more by the voltage field effect than by current
density. However, the slopes of the 5V and 15V plots were approximately
the same. From the failure rate plots, it is possible to determine an
acceleration factor at any test condition. One important precaution
which must be observed in matrix testing, as well as any other
accelerated test plan, is the assurance that no new failure mechanism is
introduced. If the increased stress tests introduce new failure
mechanisms, then the validity of predicting long-term life reliability is
lost. Matrix testing then is another means of determining acceleration
factors: (11) p. 10.

MATRIX FOR RELIABILITY TESTING

Columns: percent of rated power (0, 33, 66, 100, 133). Cells at 0 percent
power are at 0 volts; operating cells are at 5 volts and 15 volts.

    Ambient Temperature    Quantities of devices (in source order)
    25° C                  3000, 1000, 400, 200
    50° C                  1500, 1000, 200, 150
    75° C                  500(?), 400, 100(?), 150(?), 100
    100° C                 200
    125° C                 100
    150° C                 75
    175° C                 50
    200° C                 50

NOTE: Quantities of devices shown in operating cells were equally divided
between the two voltage levels used.
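
The regression step of a matrix test can be sketched in Python as a least-squares fit of the logarithm of failure rate against reciprocal absolute junction temperature, done separately for each voltage level. The failure rates below are invented for illustration and are not the Motorola results; the function names are likewise hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_log_rate(junction_temps_c, failure_rates):
    """Fit ln(rate) against 1/T, with T converted to degrees Kelvin."""
    xs = [1.0 / (t + 273.15) for t in junction_temps_c]
    ys = [math.log(r) for r in failure_rates]
    return fit_line(xs, ys)


def predicted_rate(slope, intercept, junction_temp_c):
    """Failure rate read off the fitted line at a given junction temperature."""
    return math.exp(intercept + slope / (junction_temp_c + 273.15))


# Hypothetical failure rates (percent per 1,000 hours) at two voltage levels.
temps = [100.0, 150.0, 200.0]
rates = {"5V": [0.02, 0.11, 0.45], "15V": [0.05, 0.27, 1.10]}

for volts, observed in rates.items():
    slope, intercept = fit_log_rate(temps, observed)
    factor = predicted_rate(slope, intercept, 200.0) / predicted_rate(slope, intercept, 100.0)
    print(volts, round(slope), round(factor, 1))
```

Similar slopes at the two voltage levels, as reported for the 5V and 15V plots, would indicate that voltage shifts the failure rate without changing its temperature dependence.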

Some of the conclusions made by Motorola as a result of their Accelerated
Life Tests warrant consideration here. (11), p. 145.

1. Accelerated life testing can be used to develop mathematical
models from which the failure rate at any desired time and temperature
can be computed with relatively good correlation with observed results.

2. Matrix type stress testing appears to have considerable value
as a means of obtaining information regarding the effect of time and
temperature on failure rate.

3. Standard sequential step stress testing is not consistently
adequate for establishing acceleration factors between short term
testing (14 hour to 56 hour intervals) and long term testing (1,000
hours) of the Motorola PNP silicon epitaxial planar 2N1132 transistor.
It is very probable that the cumulative effects of both time and previous
stress levels, which occur in sequentially step stressing the same
devices, sometimes hide true failure rate indications.

4. The sequential step stress testing technique is very effective
in comparing the relative reliability of two or more samples.

Other sources agree on the determination of acceleration factors,
but doubt the feasibility of using accelerated testing on complex devices.
Their conclusions are based on the current lack of knowledge regarding
statistical handling of competing failure risks. (26)

Before attempting any accelerated testing approach certain areas of
knowledge must be established:

1. Knowledge of the modes of failure, and their physical causes,
that occur at usage conditions.

2. A sufficient knowledge of the dependency of failure behavior on
experimental conditions to justify an extrapolation from accelerated
conditions to usage conditions.

If this knowledge can be attained, and assumptions regarding forms
of failure distributions and relations of distributions to environmental
conditions can be made, accelerated testing can become an effective
prediction tool.

AN AIRBORNE ELECTRONIC SYSTEM EXAMPLE

Utilizing the background information described here it would be
possible to construct a program to test the feasibility of predicting
MTBF for systems by experimenting with various environmental conditions.
Following the reasoning established by General Electric in
RADC-TDR-64-142 (27), thermal energy was used as the discriminating
environmental factor. A module from an airborne equipment with known
component parts was constructed.

Analysis  of  Data 

Failure rate data for the component parts was obtained from Military
Standardization Handbooks, and from the company producing the parts.
For the particular module under consideration, Collins Radio Company
supplied the failure rate curves based on information from the following
sources: (13)

1. Actual experienced part failure rates obtained from SAC
Ground Station Control Center equipment with a total accumulated time of
over 1.7 billion part-hours.

2. Numerous part vendors.

3. Military and civilian study programs such as the RADC
Reliability Notebook, section 8, and the Vitro Technical Report 133.

These failure rate curves were constructed using the most influential
stress factor, thermal energy, as the independent variable. If thermal
energy was not the only dominating factor, several curves were added at
various levels of the other stress factor. Stress factors used in this
report were:


    PART TYPE                   FACTOR(S)
    Electrolytic Capacitor      Body Temperature
    Ceramic Capacitor           Voltage and Body Temperature
    Mica Capacitor              Voltage and Body Temperature
    Paper Capacitor             Voltage and Body Temperature
    Diode                       Junction Temperature
    Transistor                  Junction Temperature
    Transformer and Inductor    Hot-spot Temperature
    Relay                       Contact load and duty cycle
    Composition Resistor        Body Temperature
    Film Resistor               Body Temperature
    Wire-wound Resistor         Body Temperature
    Tube                        Power Dissipation and Bulb Temperature


In considering stress factors, the hot-spot temperature was found by the
change-in-resistance method (MIL-T-27A) using the following formula:

$$T_{hs} = \frac{R - r}{r}\,(T + 234.5) + 2t - T \quad \text{in centigrade}$$

where $T_{hs}$ is the hot-spot temperature
t is the ambient temperature prior to power application
T is the ambient temperature after power application
r is the winding resistance taken at t
R is the winding resistance at T
For determining the junction temperature of a semiconductor, the case
temperature is added to the product of the power dissipation and the
thermal resistance.
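
The junction temperature rule can be written as a one-line Python function. The device values in the usage line are hypothetical, and the thermal resistance is assumed to be the junction-to-case value in degrees centigrade per watt.

```python
def junction_temperature(case_temp_c, power_w, thermal_resistance_c_per_w):
    """Case temperature plus power dissipation times thermal resistance."""
    return case_temp_c + power_w * thermal_resistance_c_per_w


# Hypothetical device: 60 C case, 0.25 W dissipated, 100 C/W junction-to-case.
print(junction_temperature(60.0, 0.25, 100.0))  # 85.0
```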

Computational Techniques

Making the same assumptions described previously concerning early
failures and wearout failures, the MTBF can be predicted mathematically
by a summation of the predicted part failure rates. Since failure rates
are frequently displayed as percent failures per 1,000 hours, the
method for determining MTBF is:

$$\theta = \frac{10^5}{\lambda_1 + \lambda_2 + \lambda_3 + \cdots + \lambda_n}$$

The table indicating part types, nominal operating levels and failure
rates for the airborne module under consideration is shown in the table
titled SYSTEM FAILURE RATE DETERMINATION. From this table it can be
determined that MTBF will be:

$$\theta = \frac{10^5}{6.47} = 15,456 \text{ hours}$$
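
The arithmetic can be checked with a few lines of Python. The two subtotals are those quoted in the report's failure rate table (electrical parts 6.410 and mechanical parts 0.060 percent per 1,000 hours); the function name is hypothetical.

```python
def mtbf_hours(rates_percent_per_1000h):
    """MTBF in hours from failure rates given in percent per 1,000 hours."""
    total = sum(rates_percent_per_1000h)
    # 1 percent per 1,000 hours is 1e-5 failures per hour, hence the 10**5.
    return 1e5 / total


# Subtotals quoted in the report: electrical parts and mechanical parts.
print(round(mtbf_hours([6.410, 0.060])))  # 15456
```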

Using increased ambient temperatures of 10°, 20°, 30° and 40° above
normal it was possible to predict MTBF's of the module at increased
temperature levels. The failure rate data was extrapolated from the
failure curves given for environments of military airborne, commercial
airborne and fixed ground units. The MTBF's (in hours) computed at each
of these levels are indicated below:

                           Nominal    +10°      +20°      +30°      +40°
    Military Airborne      15,456     11,910     7,710     5,351     3,456
    Commercial Airborne    24,??0     23,420    14,918    11,210     8,610
    Fixed Ground           42,800     34,000    24,100    16,050    10,340

SYSTEM FAILURE RATE DETERMINATION
(failure rates in percent per 1,000 hours)

    Part Type         Application Factors       Failure Rate   Qty           Total

    Electrolytic      Body Temp 65 C            .125           2             .250
    Capacitors

    Ceramic           Body Temp 40 C
    Capacitors        Voltage .6 Rated          (illegible)    (illegible)   .057
                      Voltage .8 Rated          (illegible)    (illegible)   .046

    Paper             Body Temp 45 C            (illegible)    (illegible)   .054
    Capacitors        Voltage .5 Rated

    Germanium         Junction Temp 50 C        .110           2             .220
    Diodes            Junction Temp 75 C        (illegible)    (illegible)   (illegible)

    Chokes            Hot-Spot Temp             .152           (illegible)   (illegible)
                      (illegible)

    Gen Purpose       Contact Load .5 Rated     .27            1             (illegible)
    Relays            Duty Cycle 3/Hour
                      No. Contact 2 sets

    Wire-wound        Body Temp 100 C           .0027          2             .005
    Resistors         Body Temp 125 C           (illegible)    1             .008

    Audio             Hot-Spot Temp 70 C        (illegible)    (illegible)   .175
    Transformers      Insulation Class B

    Composition       Body Temp 40 C            (illegible)    5             .034
    Resistors         Body Temp 55 C            (illegible)    4             .041
                      Body Temp 65 C            .013           1             .026
                      Body Temp (illegible)     (illegible)    1             .020

    Miniature         Bulb Temp 100 C           .835           1             1.670
    Tubes             Power .6 Rated
                      Heater Voltage .9
                      Bulb Temp 150 C           2.262          1             2.262
                      Power Rated
                      Heater Voltage .9

    Composition       N/A                       .033           2             .066
    Potentiometers

    Total Electrical Part Failure Rate                                       6.410

    Mech Parts                                  .012           5             .060

    Total Equipment Failure Rate                                             6.470


The data compiled was fed into the computer in an effort to determine
the function representing this relationship. If it is assumed that the
exponential distribution is continuous for all environments then an
acceleration factor (K) can be determined by:

$$K = \frac{\theta_i}{\theta_j}$$

where $\theta_i$ = MTBF for environment i
and $\theta_j$ = MTBF for environment j
Therefore, testing at 40° above nominal on military airborne equipment
corresponds to an acceleration factor of K = 15,456/3,456, or about 4.5.
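
The acceleration factors implied by the predicted MTBF's can be tabulated with a short Python sketch. The military airborne values are those given in the table of predicted MTBF's above; taking the nominal condition as environment i, and the names used here, are assumptions made for illustration.

```python
# Predicted MTBF (hours) for the military airborne case, keyed by
# degrees centigrade above nominal operating temperature.
MILITARY_AIRBORNE_MTBF = {0: 15456, 10: 11910, 20: 7710, 30: 5351, 40: 3456}


def acceleration_factor(mtbf_by_rise, rise_c, reference_rise_c=0):
    """K = MTBF at the reference condition / MTBF at the elevated condition."""
    return mtbf_by_rise[reference_rise_c] / mtbf_by_rise[rise_c]


for rise in sorted(MILITARY_AIRBORNE_MTBF):
    k = acceleration_factor(MILITARY_AIRBORNE_MTBF, rise)
    print(f"+{rise:2d} C  K = {k:.2f}")
```

Under the exponential assumption, one hour of testing at an elevated temperature is then treated as equivalent to K hours at nominal temperature.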

The function developed by the computer for this airborne module was the
following:

$$\theta = 0.99875\,T^{2} - 329.5\,T$$

where $\theta$ = MTBF and T = degrees centigrade beyond nominal operating
temperature. From the graphs below it can be seen that the various
thermal stresses produce a family of curves.


[Figure: MTBF in hours (log scale) versus degrees above nominal
operating temperature, with curves for Fixed Ground, Commercial Airborne,
and Military Airborne]

As described previously, if all assumptions are valid, actual MTBF
should fall within the 90 percent CI of the predicted MTBF. Using these
curves and their related environmental conditions the data collected can
now be used to illustrate the prediction of MTBF for a particular piece
of electronic equipment. If the steps as outlined in this report are
followed, the listing of component parts and their environmental
conditions must be compiled. From this list, and a knowledge of nominal
operating temperatures, it is possible to determine the failure rates
at any temperature level within operating limitations. For the device
in question, MTBF at use conditions was predicted to be 15,456 hours.
From the same failure rate data it was predicted that the MTBF at 40°
above nominal temperature was 3,456 hours. The plot of these MTBF's
versus temperature levels above nominal range approximates a linear
relationship. If this can be assumed, the method for determining
acceleration factors as described earlier in this report can be applied.

In order to verify the above mathematical analysis, it would be
essential to actually test the proposed system using the step stress
technique at various increased temperature levels. Using a sufficient
number of samples to substantiate actual failure rates, the MTBF's
can be determined by Method 2 described in this report. Assuming the
actual MTBF's correlate with the predicted MTBF's at all temperature
levels, it would be safe to say that the procedure of extracting part
failure rates at increased stress levels to determine MTBF of a
sub-system at use conditions is feasible.

The electrolytic capacitors limit this particular system from being
tested at a higher stress level, but other systems could be tested at
much higher stress levels. Normally, the failure rate data information
will indicate a maximum operating stress condition.

From the time to failure at increased stress level, confidence
limits could be predicted for the module at use conditions as described
earlier in this report. If the system was of a new design and the
mathematical model was extremely complex, the method of step stress
testing at increased temperature levels could be used. Assuming that
the characteristics of the components were not exceeded, a system MTBF
curve could be established and a relationship determined. Like systems
should have similar MTBF's and consequently these should be predictable
at various use conditions.

CONCLUSION 

The conclusion arrived at in this study is that accelerated testing
of electronic systems appears unfeasible. For the particular test run
in this study, many assumptions were made that very seldom hold true in
light of the complexity of modern electronic systems. The possibility
that there is no interaction between component parts or that systems can
be broken down into independent sub-systems seems highly improbable.
Although temperature was used as the primary stress in this study, it is
well known that various components react more vigorously to other
environmental stresses. Most authors will agree that the many different
environmental stresses encountered during various phases of the life of
an equipment are practically impossible to completely simulate. The
study carried on in this report assumed one constant environment and did
not include any determination of off duty failure rates. Another
important consideration concerning failure distributions was included
in the study to compare more complex functions with the exponential.
One of the most critical factors in carrying out a prediction study from
accelerated stresses pertains to the maximum operating limitations of
the component parts. Parts manufacturers all agree that, if their
components are used above the specified limits, failure rates can not be
predicted. With this limitation, stress levels often can not be raised
enough to be truly effective. Another problem encountered in verifying
the method proposed in this study would be the difficulty in placing the
cause of failure on a particular part. Miniaturization of today's
electronic systems creates an almost impossible task of monitoring
degradation rates.

As pointed out in the study, temperature and atmosphere must be
carefully controlled even in a simple test. All other environmental
conditions and stresses would also have to be carefully monitored to
avoid the inclusion of erroneous data. A possible solution would be the
use of computer simulation techniques through a matrix type study of
various stresses. If samples of electronic systems can then be tested
and the results compared with simulation techniques, a possible
solution may evolve.

The primary objective of the study carried out in the preparation of
this report was to provide an assessment of the feasibility of
predicting the reliability of electronic systems through the analysis of
failure rates under accelerated environmental conditions. The report
contains four major sections describing (1) some basic fundamentals of
reliability theory, (2) methods of predicting reliability, (3) various
types of accelerated testing, and (4) a typical example of predicting
the mean time between failures of an airborne electronic system. The
specific intent of the latter item was to validate procedures for
determining the mean time between failures of an electronic system under
accelerated thermal stresses, in an effort to predict the mean time
between failures of the system under use conditions.
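
The step from an accelerated-stress result to a use-condition prediction
can be sketched as follows. This is a minimal illustration, not the
procedure of this report: it assumes the Arrhenius relation as the
temperature-acceleration model, a constant failure rate at each
temperature, and a hypothetical activation energy and set of test
figures. All function names and numbers are introduced here for
illustration only.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def arrhenius_acceleration_factor(use_temp_c, stress_temp_c, activation_energy_ev):
    """Ratio of failure rate at the stress temperature to that at the use
    temperature, ASSUMING the Arrhenius model applies."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def mtbf_from_test(total_operating_hours, failures):
    """Point estimate of MTBF: accumulated operating hours per failure."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return total_operating_hours / failures


def predict_use_mtbf(stress_mtbf, acceleration_factor):
    """Scale the MTBF observed under stress back to use conditions."""
    return stress_mtbf * acceleration_factor


# Hypothetical accelerated test: 5000 operating hours, 10 failures at 100 C.
stress_mtbf = mtbf_from_test(5000.0, 10)

# Hypothetical use temperature of 40 C and activation energy of 0.7 eV.
factor = arrhenius_acceleration_factor(40.0, 100.0, 0.7)
use_mtbf = predict_use_mtbf(stress_mtbf, factor)

print(f"MTBF under stress: {stress_mtbf:.0f} h")
print(f"assumed acceleration factor: {factor:.1f}")
print(f"predicted MTBF at use conditions: {use_mtbf:.0f} h")
```

The prediction is only as good as the assumed model and activation
energy, and a single acceleration factor for a whole system presumes
that every part responds to temperature in the same way.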

The conclusion arrived at in this study was that the accelerated
testing of large scale systems appears unfeasible because of the
complexity of the systems and the inaccuracy connected with breaking
systems into sub-systems. Although temperature was used as the dominating
stress in this study, it is known that various components react to other
stresses more vigorously and that not all components fail exponentially.
Different failure distributions were discussed to provide an indication
of the complexity of developing a mathematical model to apply to a
system. Another problem area, in testing complete systems, stems from
the fact that certain elements such as electrolytic capacitors have
quite low maximum operating temperatures. This would prohibit any
testing at levels high enough to be effective. With the miniaturization
of today's electronic systems, responsibility for failure would be
extremely difficult to pinpoint. Interaction of devices at accelerated
conditions would have to be thoroughly studied and a means of monitoring
each individual component would have to be established.